Method of processing an image to form an image pyramid

ABSTRACT

A method of processing an image to form an image pyramid having multiple image levels includes receiving a base level image comprising pixel values at pixel locations arranged in rows and columns; determining sample locations for a next level image in the pyramid such that the sample locations are arranged in a regular pattern and the sample locations exceed the range of the pixel locations of the base level image; determining the pixel values of the next level image by interpolating the pixel values of the base level image using an interpolation filter at the sample locations; and treating the next level image as the base level image and repeating steps of determining sample locations and pixel values until a predetermined number of pyramid image levels are generated, or until a predetermined condition is met.

FIELD OF INVENTION

[0001] The present invention relates to an improved method of generating an image pyramid and to a method of spatially filtering a digital image using the image pyramid.

BACKGROUND OF THE INVENTION

[0002] It is well known that the dynamic range of an image captured with an image capture device (such as a photographic negative) is often greater than the dynamic range of the output medium (such as a photographic paper or CRT monitor). The result of this incongruity is that a good deal of scene content is lost in the output image. For this reason, in an image processing environment, a tone scale function may be used to reduce the scene dynamic range in order to map more information onto the output medium, in a process called dynamic range modification or dynamic range compression. The dynamic range compression modifies the tone scale characteristics of the image.

[0003] There exist many processes for creating a tone scale function on an image dependent basis; e.g. see U.S. Pat. No. 5,471,987 issued Dec. 5, 1995 to Nakazawa et al. Each of the conventional tone scale function processes examines certain statistical characteristics of the image under consideration in order to automatically generate the tone scale function. In addition, the tone scale function may be generated with manual interactive tools by a human operator.

[0004] After the tone scale function has been generated, there exists the question of how to apply the tone scale function to the digital image. The goal of dynamic range compression is to adjust the overall dynamic range of the image, rather than to affect the contrast of any given object in the image. In essence, a tone scale function should be applied to an image in such a way as to minimize the effect to the scene texture. To that end, it is known to apply a tone scale function to a low frequency sub-band of the image, preserving the higher frequency sub-band(s) that are considered image texture; e.g. see U.S. Pat. No. 5,012,333 issued Apr. 30, 1991 to Lee et al.

[0005] Lee et al. describe a procedure for preserving the high frequency detail of an image by blurring the image neutral channel in order to create a low-pass signal. Subtracting the low-pass signal from the image neutral channel produces a high-pass signal. The processed image is generated by applying the tone scale function to the low-pass signal and adding the result to the high-pass signal. This procedure preserves a segment of the image frequency spectrum; however, artifacts are generated at object boundaries in the image. Gallagher et al. build on this work; see U.S. Pat. No. 6,317,521 issued Nov. 13, 2001. More specifically, Gallagher et al. incorporate an artifact avoidance scheme along with a single standard FIR filter to generate the texture signal. While this improvement reduces the occurrence of artifacts in the final image, the artifacts can still be visible.

[0006] Several methods for achieving dynamic range modification of an image by decomposing the image into multiple resolutions have been proposed. For example, in U.S. Pat. No. 5,467,404 issued Nov. 14, 1995, and U.S. Pat. No. 5,805,721 issued Sep. 8, 1998, Vuylsteke et al. teach a method of decomposing an image into a pyramid having multiple resolution versions of the image and using a pre-determined nonlinear amplitude compression function for the high frequency component in each resolution. A deficiency of this method is that the amplitude at each resolution does not adequately identify whether the signal is part of a large amplitude edge or an image texture. A similar approach was disclosed in U.S. Pat. No. 5,717,791 issued Feb. 10, 1998 to Labaere et al., which describes a dynamic range compression scheme in which wavelet filters generate the multiple resolutions of the image pyramid.

[0007] In U.S. Pat. No. 5,907,642 issued May 25, 1999, Ito describes a method of image enhancement based on processing the detail signals of an image pyramid. Ito describes suppressing the magnitude of detail signals in situations where the next lower detail signal has small magnitude. In U.S. Pat. No. 5,991,457 issued Nov. 23, 1999, Ito describes a method of generating several band pass detail image signals that are modified by application of non-linear functions to modify the dynamic range of the image.

[0008] In U.S. Pat. No. 6,285,798 B1 issued Sep. 4, 2001, Lee describes yet another dynamic range compression method using a pyramid representation of an image. Lee describes a method of using wavelet filters to create a plurality of coarse signals and detail signals, modifying the detail signals in accordance with contrast gain signals created by detecting the edges of the coarse scale edges, and adding the modified detail signals to the coarse signals to obtain an output image.

[0009] In each of these dynamic range compression techniques using a pyramid image representation, the high frequency (e.g. edge or band pass) components of the pyramid image representation are modified to affect the image dynamic range. However, it is often inconvenient to operate on the high frequency component of the pyramid image representation. In addition, the characteristics of the high frequency signals vary as a function of the level within the pyramid image representation. This variability requires a complicated parameter tuning in order to achieve optimal dynamic range compression without producing objectionable artifacts (such as the aforementioned overshoot and undershoot artifact) using a pyramid representation of the image.

[0010] Pyramid methods, as a means of representing images as a function of spatial resolution for image processing, have a long history. Burt and Adelson described a method of representing a digital image by a series of residual images and a base digital image in their journal article “The Laplacian Pyramid as a Compact Image Code,” IEEE Transactions on Communications, Vol. Com-31, No. 4, Apr. 1983. However, the method taught by Burt et al. was designed for image compression applications and cannot be used for enhancing the tone scale of a digital image.

[0011] The prior art methods described for handling the borders of an image pyramid are inefficient. Vuylsteke, Labaere, and Ito each describe generating lower resolution base image pyramid levels by filtering and then sampling every other pixel. This method throws away image information whenever the image has an even number of rows, as the samples of the lower resolution base image span (cover) all but one of the image rows. In addition, the sampling method does not preserve phase between the resolution levels. For example, consider an 8×8 pixel image. Omitting the blurring filter, after 4 pyramid levels have been generated, the image is represented as a single pixel. The value of the single pixel is the same as the pixel of the 8×8 image in the upper left position (assuming sampling of every other pixel). Thus a phase shift of several pixels has occurred. Because of this phase shift, the pyramids used by Vuylsteke, Labaere, and Ito are sub-optimal.
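The phase shift in the 8×8 example can be checked with a short sketch. The helper name `decimate` and the test image are illustrative assumptions rather than anything taken from the cited patents, and the blurring filter is omitted as in the example.

```python
def decimate(image):
    """Keep every other pixel in each dimension, starting at index 0."""
    return [row[::2] for row in image[::2]]

# 8x8 test image whose pixel values record their own (row, column) position.
image = [[(r, c) for c in range(8)] for r in range(8)]

level = image
reductions = 0
while len(level) > 1:
    level = decimate(level)
    reductions += 1

# The single remaining pixel is the upper-left pixel of the original image,
# not a value centered on the image.
survivor = level[0][0]
```

Counting the 8×8 image itself, the four levels are 8×8, 4×4, 2×2, and 1×1. The surviving sample sits at position (0, 0), whereas the center of the 8×8 grid lies at (3.5, 3.5).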

[0012] In addition, Gendel describes an image pyramid in U.S. Pat. No. 6,141,459, issued Oct. 31, 2000. Gendel describes a method of forming an image pyramid to handle the image borders so that a smoothing filter will provide valid results. However, Gendel's solution of padding the image is inefficient and the method does not ensure that image information is not lost due to the span of the lower resolution base image being smaller than the starting base image.

[0013] Therefore, there exists a need for an improved method of forming image pyramids and in particular an improved method for processing image pyramid borders. Specifically, there is a need to improve the method of processing image pyramid borders when using an image pyramid to modify the tone scale of a digital image.

SUMMARY OF THE INVENTION

[0014] The need is met according to the present invention by providing a method of processing an image to form an image pyramid having multiple image levels that includes receiving a base level image comprising pixel values at pixel locations arranged in rows and columns; determining sample locations for a next level image in the pyramid such that the sample locations are arranged in a regular pattern and the sample locations exceed the range of the pixel locations of the base level image; determining the pixel values of the next level image by interpolating the pixel values of the base level image using an interpolation filter at the sample locations; and treating the next level image as the base level image and repeating steps of determining sample locations and pixel values until a predetermined number of pyramid image levels are generated, or until a predetermined condition is met.

Advantages

[0015] The present invention has the advantage of computational efficiency and ensures that image information is not lost due to the span of the lower resolution base image being smaller than the starting base image.

BRIEF DESCRIPTION OF THE DRAWINGS

[0016] FIG. 1 is a block diagram of a computer system suitable for practicing the present invention;

[0017] FIG. 2 is a block diagram of the digital image processor of FIG. 1 according to the present invention;

[0018] FIG. 3 is a block diagram of the luminance enhancer shown in FIG. 2;

[0019] FIG. 4 is a block diagram of a preferred pedestal image generator shown in FIG. 3;

[0020] FIG. 5 is a block diagram of the pyramid constructor shown in FIG. 4;

[0021] FIG. 6 is a block diagram of the pyramid level module shown in FIG. 5;

[0022] FIG. 7 is an illustration of the pixel locations in a base image level relative to the pixel locations of a lower resolution base image level;

[0023] FIG. 8 is a block diagram of an illustrative down sampler, shown in FIG. 6;

[0024] FIG. 9 is a block diagram of one iteration of the pedestal reconstructor shown in FIG. 4; and

[0025] FIG. 10 is a block diagram of the luma control signal generator shown in FIG. 9.

DETAILED DESCRIPTION OF THE INVENTION

[0026] In the following description, a preferred embodiment of the present invention will be described as a software program. Those skilled in the art will readily recognize that the equivalent of such software may also be constructed in hardware. Because image manipulation algorithms and systems are well known, the present description will be directed in particular to algorithms and systems forming part of, or cooperating more directly with, the method in accordance with the present invention. Other aspects of such algorithms and systems, and hardware and/or software for producing and otherwise processing the image signals involved therewith, not specifically shown or described herein may be selected from such systems, algorithms, components, and elements known in the art. Given the description as set forth in the following specification, all software implementation thereof is conventional and within the ordinary skill in such arts.

[0027] The present invention may be implemented in computer hardware. Referring to FIG. 1, the following description relates to a digital imaging system which includes an image capture device 10, a digital image processor 20, an image output device 30, and a general control computer 40. The system can include a display device 50 such as a computer console or paper printer. The system can also include an input control device 60 for an operator such as a keyboard and/or mouse pointer. The present invention can be used on multiple capture devices 10 that produce digital images. For example, FIG. 1 can represent a digital photofinishing system where the image capture device 10 is a conventional photographic film camera for capturing a scene on color negative or reversal film, and a film scanner device for scanning the developed image on the film and producing a digital image. The digital image processor 20 provides the means for processing the digital images to produce pleasing looking images on the intended output device or media. The present invention can be used with a variety of output devices 30 that can include, but are not limited to, a digital photographic printer and soft copy display. The digital image processor 20 can be used to process digital images to make adjustments for overall brightness, tone scale, image structure, etc. of digital images in a manner such that a pleasing looking image is produced by an image output device 30. Those skilled in the art will recognize that the present invention is not limited to just these mentioned image processing functions.

[0028] The general control computer 40 shown in FIG. 1 can store the present invention as a computer program stored in a computer readable storage medium, which may comprise, for example: magnetic storage media such as a magnetic disk (such as a floppy disk) or magnetic tape; optical storage media such as an optical disc, optical tape, or machine readable bar code; solid state electronic storage devices such as random access memory (RAM), or read only memory (ROM). The associated computer program implementation of the present invention may also be stored on any other physical device or medium employed to store a computer program indicated by offline memory device 70. Before describing the present invention, it facilitates understanding to note that the present invention is preferably utilized on any well-known computer system, such as a personal computer.

[0029] It should also be noted that the present invention can be implemented in a combination of software and/or hardware and is not limited to devices which are physically connected and/or located within the same physical location. One or more of the devices illustrated in FIG. 1 may be located remotely and may be connected via a wireless connection.

[0030] A digital image is comprised of one or more digital image channels. Each digital image channel is comprised of a two-dimensional array of pixels. Each pixel value relates to the amount of light received by the imaging capture device corresponding to the physical region of the pixel. For color imaging applications, a digital image will often consist of red, green, and blue digital image channels. Motion imaging applications can be thought of as a sequence of digital images. Those skilled in the art will recognize that the present invention can be applied to, but is not limited to, a digital image channel for any of the above mentioned applications. Although a digital image channel is described as a two dimensional array of pixel values arranged by rows and columns, those skilled in the art will recognize that the present invention can be applied to non rectilinear arrays with equal effect. Those skilled in the art will also recognize that describing digital image processing steps hereinbelow as replacing original pixel values with processed pixel values is functionally equivalent to describing the same processing steps as generating a new digital image with the processed pixel values while retaining the original pixel values.

[0031] There are many different types of tone scale functions that can be applied to digital images for enhancement purposes. Some digital images are derived from original scenes photographed that have a high dynamic range of intensities present. In general, it is difficult to make pleasing prints from these high dynamic range digital images since the range of pixel values is so large. For a typical high dynamic range digital image, the image content in the highlight regions (bright portions) and shadow regions (dark portions) will often be rendered without detail since photographic paper can only reproduce faithfully a limited range of intensities. Therefore, a compressive tone scale function, i.e. a tone scale function designed to compress, or reduce, the dynamic range of a digital image, can be applied to a high dynamic range digital image to reduce the numerical range of pixel values. This processed digital image when printed, will reproduce more spatial detail in the highlight and shadow regions than if the tone scale function had not been applied. Unfortunately, the application of a compressive tone scale function can also compress, or reduce the magnitude of, the fine spatial detail of the image content. Therefore, the processed images with the direct application of a tone scale function can result in dull uninteresting images.

[0032] According to one aspect of the present invention a spatial filter is used to apply a tone scale function to a digital image. The spatial filter is used to separate an original digital image into first and second signals: a pedestal signal and a texture signal. The texture signal contains image content that relates to edges and fine spatial detail. A tone scale function is applied to the pedestal signal. Since the pedestal signal does not contain fine spatial detail, but rather low frequency smoothly varying regions and edges from illumination changes, the application of the tone scale function to the pedestal signal does not reduce the magnitude of the fine spatial detail. The fine spatial detail is preserved in the texture signal, which is recombined with the processed pedestal part. The resulting process achieves the goal of reducing the overall dynamic range of the image to be within the printable range for the photographic paper (or other output medium, such as a CRT monitor) but doesn't reduce the magnitude of fine detail in the processed image.

[0033] The digital image processor 20 shown in FIG. 1 and programmed to perform one aspect of the present invention is illustrated in more detail in FIG. 2. An original digital image 101 can be received from the image capture device (shown in FIG. 1) in a variety of different color representations. However, the most typical implementation of the present invention receives the original digital image as a color digital image with red, green, and blue digital image channels. Preferably, the pixel values of the original digital image are related to the log of the scene intensity and each pixel value of each color channel is represented as a 12-bit value. Preferably, every 188 code values represents a doubling of scene intensity (i.e. a photographic stop). For example, a first pixel having a value of 1688 represents a scene intensity that is twice as great as a second pixel having a value of 1500. The present invention can operate successfully with other encodings, although modification to equation constants and shapes of functions may be required.
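The preferred log encoding can be illustrated with a small sketch. The function and constant names are hypothetical; only the figure of 188 code values per stop and the 1688/1500 example come from the paragraph above.

```python
CODE_VALUES_PER_STOP = 188  # preferred encoding: 188 code values per doubling

def intensity_ratio(code_a, code_b):
    """Scene intensity of code_a relative to code_b under the log encoding."""
    return 2.0 ** ((code_a - code_b) / CODE_VALUES_PER_STOP)

# The two pixel values from the example are exactly one stop apart.
ratio = intensity_ratio(1688, 1500)
```

Other encodings would change only the constant, which is consistent with the remark that equation constants may need modification.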

[0034] An LCC conversion module 210 receives the original digital image 101 and generates a luminance digital image 107 (containing luminance information in a single digital image channel) and a chrominance digital image 109 (containing the color information in two color-difference digital image channels). The luminance digital image 107 is input to the luminance enhancer 240 for the purpose of creating an enhanced luminance digital image 113. The chrominance digital image 109 is input to the chroma control signal generator 112 for the purpose of creating a chroma control signal 114, which will be used to modify the effect of a spatial filter 116 within the luminance enhancer 240. The tone scale generator 230 inputs the original digital image 101 and analyzes the image, outputting a tone scale function 203 that is used by the luminance enhancer to improve the tone scale of the digital image. The chrominance digital image 109 and enhanced luminance digital image 113 are received by the RGB conversion module 220 which performs a color transformation and generates the enhanced digital image 102 (containing red, green, and blue digital image channels) which is in the same color representation as the original digital image 101. The enhanced luminance digital image 113 is produced from the luminance digital image 107 and the tone scale function 203.

[0035] The LCC module 210 shown in FIG. 2 preferably employs a 3 element by 3 element matrix transformation to convert the red, green, and blue pixel values of the original digital image 101 into luminance and chrominance pixel values. Let R(x,y), G(x,y), and B(x,y) refer to the pixel values corresponding to the red, green, and blue digital image channels located at the xth row and yth column. Let L(x,y), GM(x,y), and ILL(x,y) refer to the transformed luminance, first chrominance, and second chrominance pixel values respectively of an LCC original digital image. The elements of the 3 element by 3 element matrix transformation are described by (1).

$$\begin{aligned} L(x,y) &= 0.333\,R(x,y) + 0.333\,G(x,y) + 0.333\,B(x,y) \\ GM(x,y) &= -0.25\,R(x,y) + 0.50\,G(x,y) - 0.25\,B(x,y) \\ ILL(x,y) &= -0.50\,R(x,y) + 0.50\,B(x,y) \end{aligned} \tag{1}$$

[0036] Those skilled in the art will recognize that the exact values used for coefficients in the luminance/chrominance matrix transformation may be altered and still yield substantially the same effect.

[0037] The collection of luminance pixel values is the single-channel luminance digital image. The chrominance digital image has two channels, the green-magenta channel (whose values are GM(x,y) ) and the illuminant channel (whose values are ILL(x,y)). The luminance digital image is made up of luminance pixel values and the chrominance digital image is made up of chrominance pixel values.

[0038] The RGB conversion module 220 shown in FIG. 2 employs a 3 element by 3 element matrix transformation to convert the luminance and chrominance pixel values into red, green, and blue pixel values by performing the inverse matrix operation to the LCC module 210. The matrix elements of the RGB module are given by (2).

$$\begin{aligned} R(x,y) &= L(x,y) - 0.666\,GM(x,y) - ILL(x,y) \\ G(x,y) &= L(x,y) + 1.333\,GM(x,y) \\ B(x,y) &= L(x,y) - 0.666\,GM(x,y) + ILL(x,y) \end{aligned} \tag{2}$$
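As a minimal sketch of the round trip between (1) and (2), the following applies both transformations to a single pixel. The function names are hypothetical. One assumption is labeled in the code: the rounded coefficients 0.333, 0.666, and 1.333 are replaced by the exact fractions 1/3, 2/3, and 4/3 so that the inverse reproduces the input to within floating point error.

```python
def rgb_to_lcc(r, g, b):
    """Forward transform of Eq. (1); 1/3 is assumed in place of 0.333."""
    lum = (r + g + b) / 3.0
    gm = -0.25 * r + 0.50 * g - 0.25 * b
    ill = -0.50 * r + 0.50 * b
    return lum, gm, ill

def lcc_to_rgb(lum, gm, ill):
    """Inverse transform of Eq. (2); 2/3 and 4/3 assumed for 0.666 and 1.333."""
    r = lum - (2.0 / 3.0) * gm - ill
    g = lum + (4.0 / 3.0) * gm
    b = lum - (2.0 / 3.0) * gm + ill
    return r, g, b

# Arbitrary example pixel in 12-bit code values.
pixel = (1500.0, 1688.0, 1200.0)
restored = lcc_to_rgb(*rgb_to_lcc(*pixel))
```

With the rounded coefficients as printed, the round trip would be only approximate, in line with the remark that the exact coefficient values may be altered and still yield substantially the same effect.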

[0039] The tone scale function generator 230 (shown in FIG. 2) preferably generates the tone scale function by an analysis of the original digital image 101. Preferably the tone scale function generator 230 incorporates the method described by Lee et al. in U.S. Pat. No. 5,822,453, issued Oct. 13, 1998, which is incorporated herein by reference, to calculate and output the tone scale function 203.

[0040] The present invention can also be used with tone scale functions that are not derived from an analysis of the original digital image 101, i.e. scene independent tone scale functions. For example, a linear tone scale function constructed as T₅(x)=0.6(x−x_r)+x_r has been implemented and used as the tone scale function 203 yielding excellent image enhancement results. This tone scale function achieves a dynamic range compression effect due to the linear equation having a slope of less than 1.0.

[0041] The luminance enhancer 240 is illustrated in more detail in FIG. 3. The luminance digital image 107 is input to the pedestal generator 120 for producing a pedestal signal 122. The pedestal image 122 is essentially identical to the luminance digital image 107, with the exception that image texture is removed (i.e. texture that would be damaged if affected by the tone scale function). Ideally, the pedestal image 122 is smooth with sharp transitions corresponding to large lighting edges (such as the transition from a bright sky to a backlit mountain, or the edge of a high contrast shadow) in the luminance digital image. Preferably, the pedestal signal p(x,y) is made up of the same number of rows and columns of pixels as the luminance digital image 107. The pedestal generator 120 will be described in greater detail hereinbelow.

[0042] The pedestal image 122 and the luminance digital image 107 are input to a texture generator 128 for producing a texture image 130. The texture image contains the image texture whose magnitude will be unaffected by the tone scale function. The texture generator 128 generates the texture image 130 according to the following equation:

t(x,y)=L(x,y)−p(x,y)  (3)

[0043] where:

[0044] L(x,y) represents the value of the pixel of the luminance digital image at the (x,y) location.

[0045] p(x,y) represents the value of the pedestal image at the (x,y) location.

[0046] t(x,y) represents the value of the texture image at the (x,y) location.

[0047] Note that the sum of the pedestal and the texture images is the luminance digital image.

[0048] The pedestal image is input to the tone scale function applicator 124, which produces the modified pedestal image pm(x,y). The tone scale function applicator 124 produces the modified pedestal image 126 according to the equation:

pm(x,y)=T[p(x,y)]  (4)

[0049] where:

[0050] pm(x,y) represents the value of the pixel of the modified pedestal image 126 at the (x,y) location.

[0051] T[x] represents the value of the tone scale function 203 for an input value of x. Those skilled in the art will recognize that the tone scale applicator 124 simply applies a look-up-table (LUT) to the pedestal signal, producing the modified pedestal image 126.

[0052] The modified pedestal image 126 and the texture image 130 are then added by an adder 132, producing the enhanced luminance digital image 113. The adder 132 generates the enhanced luminance digital image simply by summing the pixel values of the texture image and the modified pedestal image 126, according to the equation:

Le(x,y)=pm(x,y)+t(x,y)  (5)

[0053] where:

[0054] Le(x,y) represents the value of the pixel of the enhanced luminance digital image 113 at the (x,y) location.

[0055] pm(x,y) represents the value of the pixel of the modified pedestal image 126 at the (x,y) location.

[0056] t(x,y) represents the value of the pixel of the texture image 130 at the (x,y) location.
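Equations (3) through (5) can be sketched as follows for a luminance image and a pedestal image of the same size. This is an illustration under stated assumptions: the function names are hypothetical, the pedestal values are supplied by hand rather than produced by the pedestal generator 120, and the reference value of 1500 in the linear tone scale is an arbitrary choice.

```python
def linear_tone_scale(x, x_ref=1500.0, slope=0.6):
    """Linear compressive tone scale: slope * (x - x_ref) + x_ref.
    The reference value x_ref = 1500 is an assumed, illustrative choice."""
    return slope * (x - x_ref) + x_ref

def enhance_luminance(luminance, pedestal, tone_scale):
    """Apply Eqs. (3)-(5) to same-sized 2-D lists of pixel values."""
    enhanced = []
    for lum_row, ped_row in zip(luminance, pedestal):
        out_row = []
        for lum, ped in zip(lum_row, ped_row):
            texture = lum - ped                  # Eq. (3)
            modified = tone_scale(ped)           # Eq. (4)
            out_row.append(modified + texture)   # Eq. (5)
        enhanced.append(out_row)
    return enhanced

# Dark row and bright row, each with +/-5 code values of fine texture.
luminance = [[1000.0, 1010.0], [2000.0, 1990.0]]
pedestal = [[1005.0, 1005.0], [1995.0, 1995.0]]
enhanced = enhance_luminance(luminance, pedestal, linear_tone_scale)
```

In this example the gap between the dark and bright pedestal values shrinks from 990 to 594 code values, while the 10 code value texture difference within each row is unchanged.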

[0057] The preferred embodiment of the pedestal generator 120 is shown in FIG. 4, where a pyramid representation of the image is used so that the produced pedestal image 122 is a result of removing texture information at many different scales. The luminance digital image 107 is input to the pyramid constructor 156, which outputs an image pyramid representation 108. The image pyramid representation 108 contains all of the information that is contained in the luminance digital image 107, and the image pyramid representation 108 can be easily converted back to the luminance digital image 107. The image pyramid representation 108 includes several image signals, including the base digital image, which is essentially a smaller (fewer pixels) version of the luminance digital image 107, and residual images, which contain highpass information from different scales (i.e. the residual images contain bandpass information.) The pyramid representation 108 is input to the pedestal reconstructor 158, which combines the image signals of the image pyramid representation 108 in such a manner that the texture information is removed, forming the pedestal image 122.

[0058] The pyramid constructor 156 according to one aspect of the invention is illustrated in more detail in FIG. 5. A digital image 164 is input to the pyramid level module 115₁. The pyramid level module 115₁ produces a first base digital image 103₁ and a residual digital image 104₁. The residual digital image preferably has the same number of rows and columns of pixels as the digital image 164. The base image 103₁ has fewer rows and columns of pixels than the digital image 164; the numbers of rows and columns are determined by the size of the digital image 164 and the decimation factor Q. Preferably Q=2. The base image level 103₁ produced by the first pyramid level module 115₁ is input to the second pyramid level module 115₂, producing a second base image level 103₂ and a second residual image 104₂. The process continues iteratively through N levels, ending when the final pyramid level module 115_(N) inputs the (N−1)th base image level and produces the Nth base image level 103_(N) and the Nth residual image 104_(N). The digital image is also a base image level, so the pyramid level module always inputs a digital image. The digital image input to the first pyramid level module can be considered the 0th base image level.

[0059] FIG. 6 fully illustrates the pyramid level module 115. The nth base image level 103_(n) is input to a down sampler 170 for producing the (n+1)th base image level 103_(n+1). The down sampler 170 produces a new digital image level whereby each pixel value of the new digital image level is formed by performing a weighted average (e.g. with a convolution filter) of the pixel values of the nth base image level 103_(n). Preferably, the pixel values of the (n+1)th base image level 103_(n+1) are determined as the mean pixel value of corresponding Q×Q non-overlapping blocks in the nth base image level 103_(n). This technique for reducing the resolution of an image by an integer factor is well known in the art of image processing, as is applying a lowpass convolution filter followed by a sampling operation.

[0060] Proper handling of the border pixels in the pyramid constructor 156 is quite important, for reasons of computational efficiency as well as image quality. Padding the base image level is inefficient because new memory must be allocated. The down sampler 170 generates a lower resolution base image level 103_(n+1) from an input base image level 103_(n). Let I₀(x₀,y₀) represent the pixel values of the input base image level 103_(n) with X₀ rows and Y₀ columns of pixels, where x₀ is an integer over the range from 0 to X₀−1, and y₀ is an integer ranging from 0 to Y₀−1. Also, let I₁(x₁,y₁) represent the pixel values of the (n+1)th base image level 103_(n+1) with X₁ rows and Y₁ columns of pixels, where x₁ is an integer over the range from 0 to X₁−1, and y₁ is an integer ranging from 0 to Y₁−1.

[0061] The lower resolution base image level 103_(n+1) generally has fewer pixels in both the horizontal and vertical directions than the base image level 103_(n). However, the area of the scene corresponding to each pixel is larger. In addition, it is preferable that the span of (scene area covered by) the lower resolution base image level 103_(n+1) be at least as large as that of the base image level 103_(n) to avoid the need to extrapolate beyond the range of the pixel locations during the process executed by the interpolator 172.

[0062] The operation of the down sampler 170 can be thought of as finding the values I₁(x₁,y₁) of the lower resolution base image level 103_(n+1) based on the values I₀(x₀,y₀) of the base image level 103_(n). This is accomplished with the following steps:

[0063] A. The pixel locations (x₁,y₁) of the lower resolution base image level 103_(n+1) are mapped back to sample locations (x̃₀,ỹ₀) in the base image level 103_(n). Typically, sample locations (x̃₀,ỹ₀) will not correspond to an exact integer location, but will fall between pixel sites.

[0064] Preferably, the mapping is given as:

$$\tilde{x}_0 = x_1 Q + \text{offset}, \qquad \tilde{y}_0 = y_1 Q + \text{offset} \tag{6}$$

[0065] The sample locations (x̃₀,ỹ₀) are arranged in a regular pattern (a grid with distance of Q between adjacent samples). The minimum value of x̃₀ is not greater than zero and the maximum value of x̃₀ is not less than X₀−1. Likewise, the minimum value of ỹ₀ is not greater than zero and the maximum value of ỹ₀ is not less than Y₀−1. Thus, the sample locations (x̃₀,ỹ₀) span the entire range of the pixel locations of the base image level 103_(n).

[0066] Preferably, Q is an integer indicating the distance between samples in the base image level 103 _(n). The offset in Eq. (6) is used to ensure that the span of the mapped pixel locations ({tilde over (x)}₀,{tilde over (y)}₀) is equal to or greater than the span of the pixel locations of the base image level 103 _(n). In addition, the offset ensures that phase is preserved between resolution levels. The span is the distance between the maximum and the minimum for each image dimension. Preferably, offset=0 when Q is odd and offset=−½ when Q is even.

[0067] As stated before, the lower resolution base image level 103 _(n+1) has X₁ rows and Y₁ columns of pixels. The value X₁ is found such that when the value X₁−1 is mapped back according to Eq. (6), it is at least as great as the value of X₀−1. A similar procedure can be used to determine Y₁.

[0068] Consider the preferred case where Q=2. The base image level 103 _(n) has X₀ rows. If X₀ is even, then the lower resolution base image level 103 _(n+1) will have X₁=X₀/2+1 rows, according to the technique described above. If X₀ is odd, then the lower resolution base image level 103 _(n+1) will have X₁=(X₀+1)/2+1 rows. The number of columns in the lower resolution base image can similarly be determined. FIG. 7 shows an illustration of the pixel locations of the base image level 103 _(n), and the sample locations.

[0069] More generally, when Q is even, if Q evenly divides into X₀, then the lower resolution base image will have X₁=X₀/Q+1 rows. Otherwise, the lower resolution base image level will have X₁=int(X₀/Q)+2 rows, where the int( ) function returns the largest integer not greater than the argument.

[0070] When Q is odd, if Q evenly divides into X₀−1, then the lower resolution base image will have X₁=(X₀−1)/Q+1 rows. Otherwise, the lower resolution base image level will have X₁=int((X₀−1)/Q)+2 rows.
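By way of illustration only, the row count rules of paragraphs [0067] through [0070] can be sketched in Python as follows. The function names are hypothetical and are not part of the described embodiment. The sketch assumes the preferred offsets of Eq. (6) and checks the closed-form counts against a direct search for the smallest X₁ whose last sample location reaches X₀−1.

```python
from fractions import Fraction


def offset_for(Q):
    """Preferred offset of Eq. (6): 0 for odd Q, -1/2 for even Q."""
    return Fraction(0) if Q % 2 else Fraction(-1, 2)


def sample_location(x1, Q):
    """Map a next-level index x1 back to a base-level location (Eq. 6)."""
    return x1 * Q + offset_for(Q)


def next_level_size(X0, Q):
    """Closed-form number of rows (or columns) of the next level."""
    n = X0 if Q % 2 == 0 else X0 - 1
    return n // Q + 1 if n % Q == 0 else n // Q + 2


def next_level_size_by_search(X0, Q):
    """Smallest X1 whose last sample location is not less than X0 - 1."""
    X1 = 1
    while sample_location(X1 - 1, Q) < X0 - 1:
        X1 += 1
    return X1


if __name__ == "__main__":
    # The 5-row, 6-column example of FIG. 7 with Q = 2.
    print(next_level_size(5, 2), next_level_size(6, 2))
    print([sample_location(x1, 2) for x1 in range(next_level_size(5, 2))])
```

Exact fractions are used here only so that the half-pixel offset is compared without rounding; an implementation could equally work in integer arithmetic.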

[0071] B. The value of the pixel I₁(x₁,y₁) is determined by interpolating between the pixel values nearby I₀({tilde over (x)}₀,{tilde over (y)}₀) (the sample location). This type of interpolation is well known in the art of image processing and can be accomplished by nearest neighbor interpolation, bilinear interpolation, bicubic interpolation, or any number of other interpolation methods. This interpolation can be expressed as a convolution operation, according to the following equation: $\begin{matrix} {{I_{1}\left( {x_{1},y_{1}} \right)} = {\sum\limits_{\underset{j = {- \infty}}{i = {- \infty}}}^{+ \infty}{{f_{d}\left( {i,j} \right)}{I_{0}\left( {{{\overset{\sim}{x}}_{0} - i},{{\overset{\sim}{y}}_{0} - j}} \right)}}}} & (7) \end{matrix}$

[0072] where f_(d)(i,j) is the smoothing filter. Note that I₀ is nonzero only when indices are integers. The shape and size of the smoothing filter can depend on the decimation factor Q. In the preferred case where Q=2, the preferred filter is defined as:

f _(d)(i,j)=1−|i|−|j|+|ij| for |i|, |j|<1, and 0 elsewhere.  (8)

[0073] This is a bilinear filter commonly known in the art. When Q=2 and the mapping of Eq. (6) is used, then the smoothing filter performs block averaging, where each pixel in the lower resolution base image 103 _(n+1) is created by averaging 2×2 pixel blocks of the base image level 103 _(n).

[0074] Border conditions must be handled when the indices of Eq. (7) are outside of the range of the image I₀(x₀,y₀). The preferred method of determining a value for Eq. (7) when the indices of the image are outside of the span of the image is to use mirroring, described in Eq. (9).

[0075] By definition,

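$\begin{matrix} {{I_{0}\left( {m,n} \right)} = \left\{ \begin{matrix} {I_{0}\left( {{- m},n} \right)} & {when} & {m < 0} \\ {I_{0}\left( {{{2X_{0}} - m - 2},n} \right)} & {when} & {m > {X_{0} - 1}} \end{matrix} \right.} & (9) \end{matrix}$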

[0076] Although Eq. (9) shows mirroring in only the rows of the image, the mirroring is handled in an identical fashion for the image columns. Those skilled in the art will recognize that many obvious mirroring variations can be devised.
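As a minimal sketch of steps A and B for the preferred case Q=2, the following Python code evaluates a bilinear interpolation at the sample locations of Eq. (6) and mirrors out-of-range indices according to Eq. (9). All function names are hypothetical. The sketch assumes an image stored as a list of rows with at least three rows and three columns, so that a single reflection always lands inside the image.

```python
import math


def mirror(m, size):
    """Reflect an out-of-range index into 0..size-1 following Eq. (9)."""
    if m < 0:
        return -m
    if m > size - 1:
        return 2 * size - m - 2
    return m


def pixel(img, r, c):
    """Read img[r][c], mirroring indices that fall outside the image."""
    return img[mirror(r, len(img))][mirror(c, len(img[0]))]


def bilinear(img, r, c):
    """Bilinear interpolation of img at the fractional location (r, c)."""
    r0, c0 = math.floor(r), math.floor(c)
    fr, fc = r - r0, c - c0
    return ((1 - fr) * (1 - fc) * pixel(img, r0, c0)
            + (1 - fr) * fc * pixel(img, r0, c0 + 1)
            + fr * (1 - fc) * pixel(img, r0 + 1, c0)
            + fr * fc * pixel(img, r0 + 1, c0 + 1))


def down_sample_q2(img):
    """Down sample by Q = 2 using offset = -1/2 in Eq. (6)."""
    rows, cols = len(img), len(img[0])
    out_rows = (rows + 1) // 2 + 1
    out_cols = (cols + 1) // 2 + 1
    return [[bilinear(img, 2 * r1 - 0.5, 2 * c1 - 0.5)
             for c1 in range(out_cols)]
            for r1 in range(out_rows)]


if __name__ == "__main__":
    # 5 rows by 6 columns, as in the example of FIG. 7.
    image = [[10 * r + c for c in range(6)] for r in range(5)]
    for row in down_sample_q2(image):
        print(row)
```

Because every sample location falls halfway between pixel sites in both directions, each of the four bilinear weights equals 1/4, which is the 2×2 block averaging behaviour noted in paragraph [0073].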

[0077] FIG. 7 shows an example of the pixel and sample locations for an illustrative base image level having 5 rows and 6 columns. The pixel locations of the base image level 103 _(n) are indicated by small circles. The lower resolution base image level 103 _(n+1) has 4 rows and 4 columns. The sample locations are found with Eq. (6), and are indicated with x's in FIG. 7. The interpolated values found at these sampling locations make up the 4 rows and 4 columns of the lower resolution base image level 103 _(n+1). Notice that the span of the lower resolution base image level 103 _(n+1) is larger than that of the base image level 103 _(n).

[0078] The operation of the down sampler 170 can be described in another fashion, as shown in FIG. 8, which aids in understanding but is substantially less efficient. A border of at least int(Q/2) pixels is circumscribed about the base image level 103 _(n) by chroma control signal generator 112. When Q is even and Q does not divide evenly into X₀, then extra rows are added (after the X₀ ^(th) row). A similar procedure is executed for the columns. The values of the border pixels are determined through mirroring (see Eq. (9)). This results in a padded base image level 105 having a number of rows and columns evenly divisible by Q. The filtering and decimation are then applied to the padded base image level 105 with the block averager 114. For example, when Q=2, block averaging with a 2×2 block is performed in the padded base image level 105 _(n) to generate the lower resolution base image level 103 _(n+1).

[0079] As before, the lower resolution base image level 103 _(n+1) has int((X₀+1)/2)+1 rows and int((Y₀+1)/2)+1 columns. The border is added so that the span of the lower resolution base image level will be at least as large as the span of the base image level 103 _(n). If the span of the lower resolution base image level is smaller than the base image level, then information is lost for that and subsequent pyramid levels. This causes a deleterious effect to the resulting image when the residual images 104 are modified and cannot compensate for the lost information.
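The padded formulation of FIG. 8 can be illustrated, for Q=2 only, with the short Python sketch below. The function names are hypothetical, and the image is assumed to be a list of rows with at least three rows and three columns. A one-pixel mirrored border is added on every side, one further mirrored row or column is appended when the dimension is odd, and non-overlapping 2×2 blocks are then averaged.

```python
def mirror(m, size):
    """Reflect an out-of-range index into 0..size-1 following Eq. (9)."""
    if m < 0:
        return -m
    if m > size - 1:
        return 2 * size - m - 2
    return m


def pad_q2(img):
    """Build the padded base image level for Q = 2 by mirroring."""
    rows, cols = len(img), len(img[0])
    # One border pixel on each side, plus one extra when the size is odd.
    row_idx = range(-1, rows + 1 + rows % 2)
    col_idx = range(-1, cols + 1 + cols % 2)
    return [[img[mirror(r, rows)][mirror(c, cols)] for c in col_idx]
            for r in row_idx]


def block_average_2x2(padded):
    """Average non-overlapping 2x2 blocks of an even-sized image."""
    return [[(padded[r][c] + padded[r][c + 1]
              + padded[r + 1][c] + padded[r + 1][c + 1]) / 4
             for c in range(0, len(padded[0]), 2)]
            for r in range(0, len(padded), 2)]


if __name__ == "__main__":
    image = [[10 * r + c for c in range(6)] for r in range(5)]
    padded = pad_q2(image)
    print(len(padded), len(padded[0]))      # padded size, divisible by 2
    for row in block_average_2x2(padded):
        print(row)
```

The sketch allocates a full padded copy of the image, which is the memory cost that the direct sample-location approach avoids.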

[0080] The output of the down sampler 170 is the base image level n+1 103 _(n+1), which is also an output of the pyramid level module 115. In the process of generating the reduced resolution base image level 103 _(n+1), highpass information is discarded. The residual signal contains the discarded highpass information such that the base image level 103 _(n+1) and the residual image 104 _(n+1) can be combined to form the base image level 103 _(n). To produce the residual image 104 _(n+1), the base image level 103 _(n+1) output from the down sampler 170 is passed to an interpolator 172 for interpolation by the factor Q. Preferably, the interpolator 172 performs a standard bilinear interpolation by a factor of two. The preferred bilinear interpolation is such that the phase of the interpolated image is the same as that of the base image level 103 _(n). The output of the interpolator 172 is the interpolated base image level 174.

[0081] The interpolated base image level 174 has the same number of rows and columns of pixels as the base image level 103 _(n). Let I_(B0)(x₀,y₀) represent the pixel values of the interpolated base image level 174 where x₀ is an integer over the range from 0 to X₀−1, and y₀ is an integer ranging from 0 to Y₀−1. The operation of the interpolator 172 can be thought of as finding the values of I_(B0)(x₀,y₀) based on the values of I₁(x₁,y₁). This is accomplished with the following steps:

[0082] A. The pixel locations (x₀,y₀) of the interpolated base image level 174 are mapped to locations ({tilde over (x)}₁,{tilde over (y)}₁) in the lower resolution base image level 103 _(n+1). Typically, ({tilde over (x)}₁,{tilde over (y)}₁) will not correspond to an exact integer location, but will fall between pixel sites.

[0083] Preferably, the mapping is given as: $\begin{matrix} {{{\overset{\sim}{x}}_{1} = {\frac{x_{0}}{Q} + \text{offset2}}}\qquad{{\overset{\sim}{y}}_{1} = {\frac{y_{0}}{Q} + \text{offset2}}}} & (10) \end{matrix}$

[0084] Q is an integer that indicates the distance between samples in the base image level 103 _(n), in terms of the interpolated base image level 174.

[0085] Preferably, offset2=0 when Q is odd and offset2=1/(2Q) when Q is even, so that Eq. (10) is the inverse of the mapping of Eq. (6).

[0086] B. Interpolation is again performed to find the value of I_(B0)(x₀,y₀).

[0087] Preferably, $\begin{matrix} {{I_{B0}\left( {x_{0},y_{0}} \right)} = {\sum\limits_{\underset{j = {- \infty}}{i = {- \infty}}}^{+ \infty}{{f_{d}\left( {i,j} \right)}{I_{1}\left( {{{\overset{\sim}{x}}_{1} - i},{{\overset{\sim}{y}}_{1} - j}} \right)}}}} & (11) \end{matrix}$

[0088] The preferred filter f_(d) is given in Eq. (8). When Eq. (11) is used, the values of I_(B0)(x₀,y₀) are found with bilinear interpolation. In the preferred case where Q=2, the weights on the pixel values of the lower resolution base image I₁(x₁,y₁) are always either 9/16, 3/16, or 1/16.

[0089] Using the preferred parameters, the interpolator never needs to perform mirroring, since the integer values of the indices of I₁({tilde over (x)}₁−i, {tilde over (y)}₁−j) are always within the range of 0 to X₁−1 for the row index and 0 to Y₁−1 for the column index. This follows directly from the fact that, in the operation of the preferred down sampler 170, the span of the low resolution base image level 103 _(n+1) is at least as large as the span of the base image level 103 _(n).
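The interpolation weights for Q=2 can be checked with the following Python sketch, offered as an illustration under stated assumptions rather than as the implementation of the interpolator 172. It assumes offset2=1/(2Q)=1/4, the value that inverts the mapping of Eq. (6) with offset=−½, and the function name is hypothetical. In one dimension each output pixel draws on two neighbouring lower resolution pixels with weights 3/4 and 1/4, and the two-dimensional weights are products of these.

```python
import math
from fractions import Fraction


def interp_taps_1d(x0):
    """Return [(index, weight), (index, weight)] in the lower level for x0.

    Assumes Q = 2 and offset2 = 1/4 in Eq. (10).
    """
    x1 = Fraction(x0, 2) + Fraction(1, 4)
    base = math.floor(x1)
    frac = x1 - base
    return [(base, 1 - frac), (base + 1, frac)]


def weights_2d(x0, y0):
    """Bilinear weights for pixel (x0, y0) as products of the 1-D weights."""
    return sorted(wr * wc
                  for _, wr in interp_taps_1d(x0)
                  for _, wc in interp_taps_1d(y0))


if __name__ == "__main__":
    print(interp_taps_1d(0), interp_taps_1d(1))
    print(weights_2d(0, 0))
```

For every base image size checked in this way, the largest index used is X₁−1, so no value outside the lower resolution image is read.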

[0090] The interpolated base image level 174 is input to the differencer 176. The differencer 176 calculates the residual image 104 _(n+1) by subtracting the interpolated base image level 174 from the base image level 103 _(n), according to the equation:

r(x,y)=b(x,y)−bi(x,y)  (12)

[0091] where:

[0092] r(x,y) represents the value of the pixel of the residual image 104 _(n+1) at the (x,y) location,

[0093] b(x,y) represents the value of the n^(th) base image level 103 _(n) at the (x,y) location, and

[0094] bi(x,y) represents the value of the interpolated base image level 174 at the (x,y) location.

[0095] From the previous equation, it is easy to see that the base digital image 103 _(n) can be formed from the base digital image 103 _(n+1) and the residual images 104 _(n+1) through a reconstruction process. The process includes the steps of interpolating the base digital image 103 _(n+1), then adding the residual digital image 104 _(n+1). By extension, the digital image 164 shown in FIG. 5 can be represented as a collection of residual images 104 _(1-N) and one base digital image 103 _(N). The perfect reconstruction of the digital image 164 is an iterative process where one base digital image and the corresponding residual image are used to create a higher resolution base digital image, which in turn is used with the next residual digital image to create an even higher resolution base digital image. In the preferred embodiment, N=8 pyramid levels are used in the pyramid representation. Perfect reconstruction of an image pyramid is well known.
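One pyramid level and its reconstruction can be sketched in one dimension as follows. This is a simplified illustration and not the two-dimensional process of the pyramid level module 115. It assumes Q=2, offset=−½, offset2=1/4, a signal of at least three samples, and hypothetical function names.

```python
import math


def mirror(m, size):
    """Reflect an out-of-range index into 0..size-1 following Eq. (9)."""
    if m < 0:
        return -m
    if m > size - 1:
        return 2 * size - m - 2
    return m


def down_sample_1d(base):
    """Average the two pixels on either side of each location 2*x1 - 1/2."""
    n = len(base)
    return [(base[mirror(2 * x1 - 1, n)] + base[mirror(2 * x1, n)]) / 2
            for x1 in range((n + 1) // 2 + 1)]


def interpolate_1d(low, n):
    """Linear interpolation back to n samples using x1 = x0/2 + 1/4."""
    out = []
    for x0 in range(n):
        x1 = x0 / 2 + 0.25
        i = math.floor(x1)
        f = x1 - i
        out.append((1 - f) * low[i] + f * low[i + 1])
    return out


def build_level(base):
    """Return the lower resolution level and the residual of Eq. (12)."""
    low = down_sample_1d(base)
    interp = interpolate_1d(low, len(base))
    residual = [b - bi for b, bi in zip(base, interp)]
    return low, residual


def reconstruct(low, residual):
    """Interpolate the lower level and add the residual back."""
    interp = interpolate_1d(low, len(residual))
    return [bi + r for bi, r in zip(interp, residual)]


if __name__ == "__main__":
    base = [3.0, 7.0, 1.0, 8.0, 2.0]
    low, residual = build_level(base)
    print(low, residual)
    print(reconstruct(low, residual))
```

In floating-point arithmetic the reconstructed values agree with the original to within rounding error; with integer pixel data an implementation would typically choose a representation for the residual that makes the round trip exact.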

[0096] An alternative of one iteration of the pedestal reconstructor 158 of FIG. 4 is shown in greater detail in FIG. 9. Referring again to FIG. 9, the pedestal signal n 122 _(n) is input to a frequency splitter 134. The frequency splitter 134 applies a spatial filter 116 to the pedestal signal, generating the lowpass signal 136. The spatial filter 116 is lowpass in nature, attenuating any high frequency content in the pedestal signal 122 _(n).

[0097] The lowpass signal 136 is input to the luma control signal generator 180 for generating the luma control signal 182. The luma control signal 182 has a value near 1.0 corresponding to “edge” regions in luminance digital image 107, and a value near 0 for other regions, with intermediate values for regions in between.

[0098] The luma control signal 182 is interpolated by a factor Q by an interpolator 172, and multiplied by the residual image 104 _(n) by a multiplier 144, resulting in a signal that is zero in non-edge regions and maintains the values of the residual image 104 _(n) in edge regions. The resulting signal is added with an adder 132 to the signal resulting from interpolating with an interpolator 172 the pedestal signal n, forming the pedestal signal 122 _(n−1). Thus, the pedestal signal 122 _(n−1) is simply an interpolated version of the pedestal signal n, with the addition of residual image 104 _(n) content in edge regions.
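The combining step of paragraph [0098] can be sketched as below. The sketch assumes that the pedestal signal n and the luma control signal have already been interpolated to the resolution of the residual image 104 _(n), and that all three are lists of rows of equal size. The function and argument names are hypothetical.

```python
def pedestal_step(pedestal_up, control_up, residual):
    """Form pedestal signal n-1 from interpolated inputs and residual n.

    Each output pixel is the interpolated pedestal value plus the residual
    value scaled by the interpolated luma control value.
    """
    return [
        [p + c * r for p, c, r in zip(p_row, c_row, r_row)]
        for p_row, c_row, r_row in zip(pedestal_up, control_up, residual)
    ]


if __name__ == "__main__":
    pedestal_up = [[10.0, 10.0, 10.0]]
    control_up = [[0.0, 0.5, 1.0]]     # non-edge, intermediate, edge
    residual = [[4.0, 4.0, 4.0]]
    print(pedestal_step(pedestal_up, control_up, residual))
```

Where the control value is 0 the residual is suppressed, and where it is 1 the residual is passed through unchanged.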

[0099] FIG. 9 illustrates the pedestal level reconstructor 158 for creating a higher resolution pedestal signal 122 from a starting pedestal signal. By iterating this pedestal level reconstructor, a pedestal signal 122 can be generated from the pyramid image representation 108 of FIG. 4. In the preferred embodiment, the pyramid image representation 108 has 8 levels.

[0100] FIG. 10 illustrates the luma control signal generator 180 that is used to create the luma control signal 182 for use in generating the pedestal signal when reconstructing an image pyramid representation, as shown in FIG. 9. In accordance with FIG. 9, a luma control signal 182 _(n) is generated for each of the resolution levels during the pedestal reconstruction.

[0101] First, the non-directional gradient G of the lowpass signal 136 is calculated by the gradient calculator 150 and output as the gradient signal 152. This calculation is performed by calculating vertical and horizontal gradients with a spatial filter called a gradient filter.

[0102] Although a variety of different gradient filters can be used, the preferred embodiment of the present invention uses two Prewitt spatial filters to generate a vertical and a horizontal gradient value for each input pixel value given by Eq. (13) and (14) $\begin{matrix} \begin{matrix} {- 1} & 0 & 1 \\ {- 1} & 0 & 1 \\ {- 1} & 0 & 1 \end{matrix} & (13) \\ \begin{matrix} {- 1} & {- 1} & {- 1} \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{matrix} & (14) \end{matrix}$

[0103] respectively.
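A minimal Python sketch of the gradient calculation is given below. It applies the two Prewitt kernels of Eq. (13) and (14) at each pixel and combines the two responses as the square root of the sum of their squares. The border treatment is not specified in this passage, so the sketch assumes that out-of-range neighbours repeat the nearest edge pixel; that choice, and all function names, are assumptions made for illustration.

```python
import math

KERNEL_13 = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]     # Eq. (13)
KERNEL_14 = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]     # Eq. (14)


def clamp(i, size):
    """Assumed border rule: repeat the nearest edge pixel."""
    return min(max(i, 0), size - 1)


def apply_3x3(img, kernel, r, c):
    """Weighted sum of the 3x3 neighbourhood of (r, c)."""
    rows, cols = len(img), len(img[0])
    total = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            total += (kernel[dr + 1][dc + 1]
                      * img[clamp(r + dr, rows)][clamp(c + dc, cols)])
    return total


def gradient_magnitude(img):
    """Non-directional gradient: root of the sum of squared responses."""
    return [[math.hypot(apply_3x3(img, KERNEL_13, r, c),
                        apply_3x3(img, KERNEL_14, r, c))
             for c in range(len(img[0]))]
            for r in range(len(img))]


if __name__ == "__main__":
    step = [[0, 0, 10, 10] for _ in range(4)]   # step edge between columns
    for row in gradient_magnitude(step):
        print(row)
```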

[0104] The non-directional gradient signal 152 is the square root of the sum of the squares of these two gradients. The control signal applicator 154 receives the gradient signal 152, the chroma control signal 114 _(n), and optionally the luma control signal 182 _(n+1) (the luma control signal previously calculated from the previous lower resolution) and produces the luma control signal 182 _(n). The values of the luma control signal 182 _(n) are in the range of 0 to 1, and are found by applying a gradient threshold that is dependent on both the value of the chroma control signal 114 _(n) and the luma control signal 182 _(n+1) to the gradient signal 152. The luma control signal 182 _(n) has a value of 1.0 corresponding to “edge” regions and a value of 0.0 corresponding to “detail” regions in the original digital image and intermediate values for regions that are not easily classified as “edge” or “detail.” The classification of “edge” is more difficult to attain (i.e. the gradient signal requirement is increased) in the luma control signal 182 _(n) where the corresponding location in the previous resolution luma control signal 182 _(n+1) was not classified as an “edge” (i.e. the pixel value was not 1.0). The operation of the control signal applicator 154 can be expressed as an equation: $\begin{matrix} {{{Lc}\left( {x,y} \right)} = \left\{ \begin{matrix} 1 & {when} & {{G\left( {x,y} \right)} < p} \\ 0 & {when} & {{G\left( {x,y} \right)} > {2p}} \\ {1.0 - \frac{{G\left( {x,y} \right)} - p}{p}} & {otherwise} & \quad \end{matrix} \right.} & (15) \end{matrix}$

[0105] where

[0106] Lc(x,y) is the value of the luma control signal 182 at the (x,y) location;

[0107] G(x,y) is the value of the gradient signal 152 at the (x,y) location;

[0108] p is a gradient threshold value that is dependent on the value of the luma control signal n+1 182 _(n+1) and is preferably: $\begin{matrix} {{p\left( {x,y} \right)} = \left\{ \begin{matrix} {{\max \left( {0,f} \right)} + {h\left( {g - {{Lo}\left( {{xx},{yy}} \right)}} \right)}} & {when} & {{{Lo}\left( {{xx},{yy}} \right)} < g} \\ {\max \left( {0,f} \right)} & {otherwise} & \quad \end{matrix} \right.} & (16) \end{matrix}$

[0109] where:

[0110] f is an arbitrary constant, preferably 68;

[0111] h is an arbitrary constant, preferably 1.4;

[0112] g is an arbitrary constant, preferably 0.4; and

[0113] Lo(xx,yy) is the value of the luma control signal n+1 at the location (xx,yy) that corresponds to the location (x,y) in the current resolution level n.

[0114] When Q=2, for example, xx=(x+½)/2, which follows from inverting the mapping of Eq. (6). The value of the luma control signal Lo(xx,yy) must be found by interpolation (preferably bilinear interpolation) when either xx or yy is not an integer. All other variables have been previously defined.
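Eq. (15) and Eq. (16) can be transcribed into Python as shown below. The sketch follows the two equations exactly as printed, under which Lc equals 1 where the gradient is below p and 0 where it exceeds 2p. The function names are hypothetical, the default constants are the preferred values of f, h, and g, and the sketch assumes p is greater than zero so that the division in the middle branch is defined.

```python
def gradient_threshold(lo, f=68.0, h=1.4, g=0.4):
    """Eq. (16): threshold p from the lower resolution control value lo."""
    base = max(0.0, f)
    if lo < g:
        return base + h * (g - lo)
    return base


def luma_control(G, p):
    """Eq. (15) as printed; assumes p > 0."""
    if G < p:
        return 1.0
    if G > 2 * p:
        return 0.0
    return 1.0 - (G - p) / p


if __name__ == "__main__":
    p = gradient_threshold(1.0)          # lo is not below g, so p = f
    print(p, luma_control(10.0, p), luma_control(102.0, p),
          luma_control(200.0, p))
```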

[0115] The present invention can be employed with any number of pyramid levels. Noise in images is generally a function of spatial resolution and is also more objectionable for the higher spatial resolution pyramid levels. The optimal number of pyramid levels depends on the texture removal goals of the digital imaging system designer and on the size of the digital images being processed. The preferred embodiment of the present invention uses 8 pyramid levels for effective texture and noise removal for digital images of size 1024 by 1536 pixels. For processing digital images of greater spatial resolution, such as 2048 by 3072 pixels, 9 pyramid levels are used. For processing digital images of lower spatial resolution, such as 512 by 768 pixels, 7 pyramid levels are used.

[0116] Rather than constructing the image pyramid with a pre-determined number of pyramid levels as described above, pyramid levels can be produced until a predetermined criterion is met. For example, pyramid levels can be produced until both image dimensions (number of rows and columns of pixels) are below 16 in the lower resolution base image level 103 _(n+1).
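The example stopping criterion can be sketched as follows for the preferred case Q=2, using the row and column counts of paragraph [0079]. The function names and the `limit` parameter are hypothetical. Because the Q=2 size rule never reduces a dimension below 2, the sketch assumes a limit greater than 3 so that the loop terminates.

```python
def next_size(n):
    """Rows or columns after one Q = 2 down-sampling step."""
    return (n + 1) // 2 + 1


def level_sizes(rows, cols, limit=16):
    """Sizes of successive lower resolution levels until both are < limit."""
    sizes = []
    while rows >= limit or cols >= limit:
        rows, cols = next_size(rows), next_size(cols)
        sizes.append((rows, cols))
    return sizes


if __name__ == "__main__":
    for size in level_sizes(1024, 1536):
        print(size)
```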

[0117] The method of the present invention can be performed in a digital camera or in a digital printer.

[0118] The invention has been described in detail with particular reference to certain preferred embodiments thereof, but it will be understood that variations and modifications can be effected within the spirit and scope of the invention.

Parts List

[0119] 10 image capture device

[0120] 20 digital image processor

[0121] 30 image output device

[0122] 40 general control computer

[0123] 50 display device

[0124] 60 input control device

[0125] 70 offline memory device

[0126] 101 original digital image

[0127] 102 enhanced digital image

[0128] 103 _(1-N) base image level

[0129] 104 _(1-N) residual image

[0130] 105 padded base image level

[0131] 107 luminance digital image

[0132] 108 image pyramid representation

[0133] 109 chrominance digital image

[0134] 112 chroma control signal generator

[0135] 113 enhanced luminance digital image

[0136] 114 chroma control signal

[0137] 115 _(1-N) pyramid level module

[0138] 116 spatial filter

[0139] 120 pedestal generator

[0140] 122 _(0-N) pedestal image

[0141] 124 tone scale function applicator

[0142] 126 modified pedestal image

[0143] 128 texture generator

[0144] 130 texture image

[0145] 132 adder

[0146] 134 frequency splitter

[0147] 136 lowpass signal

[0148] 144 multiplier

[0149] 150 gradient calculator

[0150] 152 gradient signal

[0151] 154 control signal applicator

[0152] 156 pyramid constructor

[0153] 158 pedestal reconstructor

[0154] 164 digital image

[0155] 170 down sampler

[0156] 172 interpolator

[0157] 174 interpolated base image level

[0158] 176 differencer

[0159] 180 luma control signal generator

[0160] 182 luma control signal

[0161] 203 tone scale function

[0162] 210 LCC conversion module

[0163] 220 RGB conversion module

[0164] 230 tone scale function generator

[0165] 240 luminance enhancer

What is claimed is:
 1. A method of processing an image to form an image pyramid having multiple image levels, comprising: a) receiving a base level image comprising pixel values at pixel locations arranged in rows and columns; b) determining sample locations for a next level image in the pyramid such that the sample locations are arranged in a regular pattern and the sample locations exceed the range of the pixel locations of the base level image; c) determining the pixel values of the next level image by interpolating the pixel values of the base level image using an interpolation filter at the sample locations; and d) treating the next level image as the base level image and repeating steps b) and c) until a predetermined number of pyramid image levels are generated, or until a predetermined condition is met.
 2. The method claimed in claim 1, further comprising generating a residual image having the same resolution as a base level image by interpolating a next level image to have the same resolution as the base level image, and calculating the difference between the base level image and the interpolated next level image.
 3. The method claimed in claim 1, wherein the sample locations preserve the phase of the base level image.
 4. The method claimed in claim 1, wherein the interpolation filter is a block average filter.
 5. The method claimed in claim 1, wherein the interpolation filter employs reflected pixel values at the edges of the image.
 6. A method of processing a digital image, comprising: a) generating an image pyramid by: i) receiving a base level image comprising pixel values at pixel locations arranged in rows and columns; ii) determining sample locations for a next level image in the pyramid such that the sample locations are arranged in a regular pattern and the sample locations exceed the range of the pixel locations of the base level image; iii) determining the pixel values of the next level image by interpolating the pixel values of the base level image using an interpolation filter at the sample locations; iv) generating a residual image having the same resolution as a base level image by interpolating a next level image to have the same resolution as the base level image and calculating the difference between the base level image and the interpolated next level image; and v) treating the next level image as the base level image and repeating steps ii), iii) and iv) until a predetermined number of pyramid image levels are generated, or until a predetermined condition is met; b) generating a pedestal image from the image pyramid; c) applying a tone scale function to the pedestal image to produce a tone scale adjusted pedestal image; d) generating a texture image; and e) adding the texture image to the tone scale adjusted pedestal image to produce a tone scale adjusted digital image.
 7. The method claimed in claim 6, wherein the step of generating a pedestal image comprises: c1) modifying the residual images to reduce the magnitude of the residual images in regions corresponding to non-edge regions of the base image to produce modified residual images; and c2) generating the pedestal image by reconstructing the digital image using the modified residual images.
 8. The method claimed in claim 6, wherein the sample locations preserve the phase of the base level image.
 9. The method claimed in claim 6, wherein the interpolation filter employs reflected pixel values at the edges of the image.
 10. Apparatus for processing an image to form an image pyramid having multiple image levels, comprising: a) means for receiving a base level image comprising pixel values at pixel locations arranged in rows and columns; b) means for determining sample locations for a next level image in the pyramid such that the sample locations are arranged in a regular pattern and the sample locations exceed the range of the pixel locations of the base level image; c) means for determining the pixel values of the next level image by interpolating the pixel values of the base level image using an interpolation filter at the sample locations; and d) means for treating the next level image as the base level image and repeating steps b) and c) until a predetermined number of pyramid image levels are generated, or until a predetermined condition is met.
 11. The apparatus claimed in claim 10, further comprising means for generating a residual image having the same resolution as a base level image by interpolating a next level image to have the same resolution as the base level image, and for calculating the difference between the base level image and the interpolated next level image.
 12. The apparatus claimed in claim 10, wherein the sample locations preserve the phase of the base level image.
 13. The apparatus claimed in claim 10, wherein the interpolation filter is a block average filter.
 14. The apparatus claimed in claim 10, wherein the interpolation filter employs reflected pixel values at the edges of the image.
 15. An apparatus for processing a digital image, comprising: a) means for generating an image pyramid including: i) means for receiving a base level image comprising pixel values at pixel locations arranged in rows and columns; ii) means for determining sample locations for a next level image in the pyramid such that the sample locations are arranged in a regular pattern and the sample locations exceed the range of the pixel locations of the base level image; iii) means for determining the pixel values of the next level image by interpolating the pixel values of the base level image using an interpolation filter at the sample locations; iv) means for generating a residual image having the same resolution as a base level image by interpolating a next level image to have the same resolution as the base level image and calculating the difference between the base level image and the interpolated next level image; and v) means for treating the next level image as the base level image and repeating steps ii), iii) and iv) until a predetermined number of pyramid image levels are generated, or until a predetermined condition is met; b) means for generating a pedestal image from the image pyramid; c) means for applying a tone scale function to the pedestal image to produce a tone scale adjusted pedestal image; d) means for generating a texture image; and e) means for adding the texture image to the tone scale adjusted pedestal image to produce a tone scale adjusted digital image.
 16. The apparatus claimed in claim 15, wherein the means for generating a pedestal image comprises: c1) means for modifying the residual images to reduce the magnitude of the residual images in regions corresponding to non-edge regions of the base image to produce modified residual images; and c2) means for generating the pedestal image by reconstructing the digital image using the modified residual images.
 17. The apparatus claimed in claim 15, wherein the sample locations preserve the phase of the base level image.
 18. The apparatus claimed in claim 15, wherein the interpolation filter employs reflected pixel values at the edges of the image.
 19. A software program product for performing the method of claim 1.
 20. A software program product for performing the method of claim 15.
 21. A digital image processed by the method of claim 15.